A formal theory for optimal and information theoretic syntactic pattern recognition

Authors

  • B. John Oommen
  • Rangasami L. Kashyap
Abstract

In this paper we present a foundational basis for optimal and information theoretic syntactic pattern recognition. We do this by developing a rigorous model, M*, for channels which permit arbitrarily distributed substitution, deletion and insertion syntactic errors. More explicitly, if A is any finite alphabet and A* the set of words over A, we specify a stochastically consistent scheme by which a string U ∈ A* can be transformed into any Y ∈ A* by means of arbitrarily distributed substitution, deletion and insertion operations. The scheme is shown to be Functionally Complete and stochastically consistent. Apart from the synthesis aspects, we also deal with the analysis of such a model and derive a technique by which Pr[Y|U], the probability of receiving Y given that U was transmitted, can be computed in cubic time using dynamic programming. One of the salient features of this scheme is that it demonstrates how dynamic programming can be applied to evaluate quantities that involve complex combinatorial expressions while maintaining rigid probability consistency constraints. Experimental results involving dictionaries with strings of lengths between 7 and 14, with an overall average noise of 39.75%, demonstrate the superiority of our system over existing methods. Apart from its straightforward applications in string generation and recognition, we believe that the model also has extensive potential applications in speech and unidimensional signal processing.
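To make the idea of computing Pr[Y|U] by dynamic programming concrete, the following is a minimal sketch for a much simpler channel than M*. It is not the authors' model or algorithm. It assumes a memoryless channel in which a geometric number of uniformly drawn symbols is inserted before each input symbol and at the end, and each input symbol is then either deleted or substituted through a symmetric confusion rule. This toy channel needs only quadratic time, whereas the abstract reports cubic time for M*. The function names (`channel_prob`, `total_mass`) and the parameters `p_ins`, `p_del` and `p_sub` are hypothetical and chosen only for illustration.

```python
from itertools import product


def channel_prob(u, y, alphabet, p_ins=0.1, p_del=0.2, p_sub=0.3):
    """Pr[Y|U] for a toy insertion/deletion/substitution channel.

    Assumed generative process (NOT the M* model of the paper):
      * before each symbol of u, and once more at the end, keep inserting
        a uniformly drawn symbol with probability p_ins, stop otherwise;
      * each symbol of u is deleted with probability p_del, otherwise it
        is emitted unchanged with probability 1 - p_sub or replaced by
        one of the other symbols, chosen uniformly.
    Requires an alphabet with at least two symbols.
    """
    k = len(alphabet)
    q = 1.0 / k  # probability of each inserted symbol

    def s(b, a):
        # Probability that input symbol a is emitted as b.
        return 1.0 - p_sub if a == b else p_sub / (k - 1)

    n, m = len(u), len(y)
    # f[i][j]: probability that u[:i] has been consumed and y[:j] emitted.
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j > 0:  # y[j-1] was inserted
                total += f[i][j - 1] * p_ins * q
            if i > 0:  # u[i-1] was deleted
                total += f[i - 1][j] * (1.0 - p_ins) * p_del
            if i > 0 and j > 0:  # u[i-1] was substituted by y[j-1]
                total += (f[i - 1][j - 1] * (1.0 - p_ins) * (1.0 - p_del)
                          * s(y[j - 1], u[i - 1]))
            f[i][j] = total
    # Stop inserting after the last input symbol has been consumed.
    return f[n][m] * (1.0 - p_ins)


def total_mass(u, alphabet, max_len):
    """Sum of Pr[Y|U] over every Y with len(Y) <= max_len."""
    return sum(channel_prob(u, "".join(w), alphabet)
               for length in range(max_len + 1)
               for w in product(alphabet, repeat=length))
```

Under these assumptions, `total_mass("ab", "ab", 10)` should approach 1 as `max_len` grows, which is the kind of stochastic consistency the abstract refers to. A recognizer could then pick the dictionary word U that maximizes `channel_prob(U, Y, alphabet)` for a received string Y, optionally weighted by a prior over the dictionary.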

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

A New Top-Down Context-Free Parsing for Syntactic Pattern Recognition

The many mathematical methods used to solve pattern recognition problems may be grouped into two general approaches: the decision-theoretic approach and the syntactic (structural) approach. In this paper, the syntactic pattern recognition method and formal grammars are first described, and then one of the techniques in syntactic pattern recognition is investigated, called top-d...

Optimal and Information Theoretic Syntactic Pattern Recognition for Traditional Errors

In this paper we present a foundational basis for optimal and information theoretic syntactic pattern recognition. We do this by developing a rigorous model, M*, for channels which permit arbitrarily distributed substitution, deletion and insertion syntactic errors. More explicitly, if A is any finite alphabet and A* the set of words over A, we specify a stochastically consistent scheme by whic...

Peptide classification using optimal and information theoretic syntactic modeling

We consider the problem of classifying peptides using the information residing in their syntactic representations. This problem, which has been studied for more than a decade, has typically been investigated using distance-based metrics that involve the edit operations required in the peptide comparisons. In this paper, we shall demonstrate that the Optimal and Information Theoretic (OIT) model...

On Utilizing Optimal and Information Theoretic Syntactic Modeling for Peptide Classification

Syntactic methods in pattern recognition have been used extensively in bioinformatics, and in particular, in the analysis of gene and protein expressions, and in the recognition and classification of biosequences. These methods are almost universally distance-based. This paper concerns the use of an Optimal and Information Theoretic (OIT) probabilistic model [11] to achieve peptide classificati...

Journal:
  • Pattern Recognition

Volume 31

Publication date: 1998